Overview

Dataset statistics

Number of variables: 17
Number of observations: 22699
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.5 MiB
Average record size in memory: 302.6 B

Variable types

Categorical: 4
DateTime: 2
Numeric: 10
Boolean: 1

Alerts

High correlation: RatecodeID is highly overall correlated with tolls_amount
High correlation: extra is highly overall correlated with improvement_surcharge and 1 other field
High correlation: fare_amount is highly overall correlated with total_amount and 1 other field
High correlation: improvement_surcharge is highly overall correlated with extra and 1 other field
High correlation: mta_tax is highly overall correlated with extra and 1 other field
High correlation: tip_amount is highly overall correlated with total_amount
High correlation: tolls_amount is highly overall correlated with RatecodeID
High correlation: total_amount is highly overall correlated with fare_amount and 2 other fields
High correlation: trip_distance is highly overall correlated with fare_amount and 1 other field
Imbalance: store_and_fwd_flag is highly imbalanced (96.0%)
Imbalance: payment_type is highly imbalanced (51.5%)
Imbalance: mta_tax is highly imbalanced (97.2%)
Imbalance: improvement_surcharge is highly imbalanced (99.3%)
Skewed: RatecodeID is highly skewed (γ1 = 117.1412088)
Skewed: fare_amount is highly skewed (γ1 = 21.66310069)
Skewed: total_amount is highly skewed (γ1 = 20.38940334)
Zeros: extra has 11921 (52.5%) zeros
Zeros: tip_amount has 8057 (35.5%) zeros
Zeros: tolls_amount has 21525 (94.8%) zeros
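
The imbalance percentages in the alerts appear consistent with a normalized-entropy score, 1 - H/log2(k), where H is the Shannon entropy of the class frequencies and k is the number of distinct values. The sketch below recomputes that score from the value counts reported in the per-variable sections. The function name `imbalance_score` is illustrative, and the formula is an assumption inferred from the numbers, not a confirmed description of the profiler's internals.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/log2(k) for a list of class counts (assumed formula)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy in bits of the observed class frequencies.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Value counts copied from the per-variable sections of this report.
value_counts = {
    "store_and_fwd_flag": [22600, 99],
    "payment_type": [15265, 7267, 121, 46],
    "mta_tax": [22596, 90, 13],
    "improvement_surcharge": [22679, 14, 6],
}

scores = {name: imbalance_score(c) for name, c in value_counts.items()}
for name, score in scores.items():
    print(f"{name}: {score:.1%}")
```

A score near 100% means almost all observations fall in one class; a score of 0% means the classes are equally frequent.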

Reproduction

Analysis started: 2025-12-03 20:37:10.866795
Analysis finished: 2025-12-03 20:37:24.474008
Duration: 13.61 seconds
Software version: ydata-profiling v4.18.0
Download configuration: config.json

Variables

VendorID
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 22699
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 1
3rd row: 1
4th row: 2
5th row: 2

Common Values

Value                   Count    Frequency (%)
2                       12626    55.6%
1                       10073    44.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value                   Count    Frequency (%)
2                       12626    55.6%
1                       10073    44.4%

Most occurring characters

Value                   Count    Frequency (%)
2                       12626    55.6%
1                       10073    44.4%

Most occurring categories

Value                   Count    Frequency (%)
(unknown)               22699    100.0%

Most frequent character per category

(unknown): same values as the "Most occurring characters" table.

Most occurring scripts

Value                   Count    Frequency (%)
(unknown)               22699    100.0%

Most frequent character per script

(unknown): same values as the "Most occurring characters" table.

Most occurring blocks

Value                   Count    Frequency (%)
(unknown)               22699    100.0%

Most frequent character per block

(unknown): same values as the "Most occurring characters" table.

tpep_pickup_datetime
DateTime

Distinct: 22687
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 177.5 KiB
Minimum: 2017-01-01 00:08:25
Maximum: 2017-12-31 23:45:30
Invalid dates: 0
Invalid dates (%): 0.0%
[Figure: histogram with fixed size bins (bins=50)]

tpep_dropoff_datetime
DateTime

Distinct: 22688
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 177.5 KiB
Minimum: 2017-01-01 00:17:20
Maximum: 2017-12-31 23:49:24
Invalid dates: 0
Invalid dates (%): 0.0%
[Figure: histogram with fixed size bins (bins=50)]
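
The pickup and dropoff timestamps in the Sample section of this report use a 12-hour `MM/DD/YYYY H:MM:SS AM/PM` layout. As a minimal sketch, the standard library can parse one such pair and derive a trip duration. The two strings are copied from the first sample row; the names `FMT` and `trip_seconds` are hypothetical helpers introduced for illustration.

```python
from datetime import datetime

FMT = "%m/%d/%Y %I:%M:%S %p"  # 12-hour clock with AM/PM marker

def trip_seconds(pickup: str, dropoff: str) -> float:
    """Return the elapsed time between two timestamp strings, in seconds."""
    start = datetime.strptime(pickup, FMT)
    end = datetime.strptime(dropoff, FMT)
    return (end - start).total_seconds()

# Timestamps copied from the first row of the Sample section.
duration = trip_seconds("03/25/2017 8:55:43 AM", "03/25/2017 9:09:47 AM")
print(f"{duration / 60:.2f} minutes")
```

A derived duration column of this kind is one way to cross-check trip_distance and fare_amount, but it is not part of the profiled dataset.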

passenger_count
Real number (ℝ)

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.642319
Minimum: 0
Maximum: 6
Zeros: 33
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 177.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 5
Maximum: 6
Range: 6
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.2852311
Coefficient of variation (CV): 0.78257092
Kurtosis: 3.7105074
Mean: 1.642319
Median Absolute Deviation (MAD): 0
Skewness: 2.172872
Sum: 37279
Variance: 1.651819
Monotonicity: Not monotonic
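
Several of the descriptive statistics are simple functions of the others: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, the IQR is Q3 - Q1, the range is maximum - minimum, and the mean is the sum divided by the number of observations. The following sketch checks these relations against the passenger_count figures; the variable names are illustrative.

```python
# Reported statistics for passenger_count, copied from this section.
n, total = 22699, 37279
mean, std = 1.642319, 1.2852311
q1, q3 = 1, 2
minimum, maximum = 0, 6

mean_from_sum = total / n      # mean = sum / number of observations
cv = std / mean                # coefficient of variation
variance = std ** 2            # variance is the squared standard deviation
iqr = q3 - q1                  # interquartile range
value_range = maximum - minimum

print(f"mean={mean_from_sum:.6f} CV={cv:.8f} variance={variance:.6f}")
print(f"IQR={iqr} range={value_range}")
```

The same relations can be used to sanity-check the statistics of the other numeric variables.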
[Figure: histogram with fixed size bins (bins=7)]

Common values

Value                   Count    Frequency (%)
1                       16117    71.0%
2                       3305     14.6%
5                       1143     5.0%
3                       953      4.2%
6                       693      3.1%
4                       455      2.0%
0                       33       0.1%

Minimum 10 values

Value                   Count    Frequency (%)
0                       33       0.1%
1                       16117    71.0%
2                       3305     14.6%
3                       953      4.2%
4                       455      2.0%
5                       1143     5.0%
6                       693      3.1%

Maximum 10 values

Value                   Count    Frequency (%)
6                       693      3.1%
5                       1143     5.0%
4                       455      2.0%
3                       953      4.2%
2                       3305     14.6%
1                       16117    71.0%
0                       33       0.1%

trip_distance
Real number (ℝ)

High correlation

Distinct: 1545
Distinct (%): 6.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.9133129
Minimum: 0
Maximum: 33.96
Zeros: 148
Zeros (%): 0.7%
Negative: 0
Negative (%): 0.0%
Memory size: 177.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.49
Q1: 0.99
Median: 1.61
Q3: 3.06
95-th percentile: 10.531
Maximum: 33.96
Range: 33.96
Interquartile range (IQR): 2.07

Descriptive statistics

Standard deviation: 3.6531712
Coefficient of variation (CV): 1.2539577
Kurtosis: 10.4106
Mean: 2.9133129
Median Absolute Deviation (MAD): 0.81
Skewness: 2.9949129
Sum: 66129.29
Variance: 13.34566
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value                   Count    Frequency (%)
1                       531      2.3%
0.9                     507      2.2%
0.8                     497      2.2%
1.1                     490      2.2%
0.7                     469      2.1%
1.2                     451      2.0%
0.6                     427      1.9%
1.3                     423      1.9%
1.4                     392      1.7%
1.5                     367      1.6%
Other values (1535)     18145    79.9%

Minimum 10 values

Value                   Count    Frequency (%)
0                       148      0.7%
0.01                    7        < 0.1%
0.02                    11       < 0.1%
0.03                    4        < 0.1%
0.04                    4        < 0.1%
0.05                    1        < 0.1%
0.06                    3        < 0.1%
0.07                    5        < 0.1%
0.08                    3        < 0.1%
0.09                    1        < 0.1%

Maximum 10 values

Value                   Count    Frequency (%)
33.96                   1        < 0.1%
33.92                   1        < 0.1%
32.72                   1        < 0.1%
31.95                   1        < 0.1%
30.83                   1        < 0.1%
30.5                    1        < 0.1%
30.33                   1        < 0.1%
28.23                   1        < 0.1%
28.2                    1        < 0.1%
27.97                   1        < 0.1%

RatecodeID
Real number (ℝ)

High correlation, Skewed

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.043394
Minimum: 1
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 177.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 1
Maximum: 99
Range: 98
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.70839088
Coefficient of variation (CV): 0.67892943
Kurtosis: 16112.97
Mean: 1.043394
Median Absolute Deviation (MAD): 0
Skewness: 117.14121
Sum: 23684
Variance: 0.50181765
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=6)]

Common values

Value                   Count    Frequency (%)
1                       22070    97.2%
2                       513      2.3%
5                       68       0.3%
3                       39       0.2%
4                       8        < 0.1%
99                      1        < 0.1%

Minimum 10 values

Value                   Count    Frequency (%)
1                       22070    97.2%
2                       513      2.3%
3                       39       0.2%
4                       8        < 0.1%
5                       68       0.3%
99                      1        < 0.1%

Maximum 10 values

Value                   Count    Frequency (%)
99                      1        < 0.1%
5                       68       0.3%
4                       8        < 0.1%
3                       39       0.2%
2                       513      2.3%
1                       22070    97.2%
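
The extreme skewness of RatecodeID can be traced to its frequency table: a single observation with the value 99 sits far from the other five codes. The sketch below recomputes skewness from the value counts, with and without that observation. It assumes the reported figure is the bias-adjusted sample skewness (the adjusted Fisher-Pearson coefficient); `sample_skewness` is an illustrative helper, not a function of the profiling library.

```python
import math

# RatecodeID frequency table (value -> count), copied from this section.
freq = {1: 22070, 2: 513, 3: 39, 4: 8, 5: 68, 99: 1}

def sample_skewness(freq):
    """Adjusted Fisher-Pearson skewness from a value -> count table."""
    n = sum(freq.values())
    mean = sum(v * c for v, c in freq.items()) / n
    # Second and third central moments (population form).
    m2 = sum(c * (v - mean) ** 2 for v, c in freq.items()) / n
    m3 = sum(c * (v - mean) ** 3 for v, c in freq.items()) / n
    g1 = m3 / m2 ** 1.5
    # Small-sample bias adjustment.
    return math.sqrt(n * (n - 1)) / (n - 2) * g1

with_outlier = sample_skewness(freq)
without_outlier = sample_skewness({v: c for v, c in freq.items() if v != 99})
print(f"with 99: {with_outlier:.2f}, without 99: {without_outlier:.2f}")
```

Dropping the single 99 record lowers the skewness by roughly an order of magnitude, which suggests reviewing that record before treating RatecodeID as a numeric feature.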

store_and_fwd_flag
Boolean

Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.3 KiB

Value                   Count    Frequency (%)
False                   22600    99.6%
True                    99       0.4%

PULocationID
Real number (ℝ)

Distinct: 152
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 162.41235
Minimum: 1
Maximum: 265
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 177.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 48
Q1: 114
Median: 162
Q3: 233
95-th percentile: 261
Maximum: 265
Range: 264
Interquartile range (IQR): 119

Descriptive statistics

Standard deviation: 66.633373
Coefficient of variation (CV): 0.41027282
Kurtosis: -0.89920699
Mean: 162.41235
Median Absolute Deviation (MAD): 67
Skewness: -0.25777668
Sum: 3686598
Variance: 4440.0064
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value                   Count    Frequency (%)
237                     890      3.9%
161                     861      3.8%
186                     792      3.5%
236                     785      3.5%
162                     779      3.4%
234                     749      3.3%
170                     749      3.3%
48                      741      3.3%
230                     739      3.3%
142                     649      2.9%
Other values (142)      14965    65.9%

Minimum 10 values

Value                   Count    Frequency (%)
1                       3        < 0.1%
4                       60       0.3%
7                       37       0.2%
10                      1        < 0.1%
12                      9        < 0.1%
13                      227      1.0%
14                      2        < 0.1%
17                      6        < 0.1%
24                      62       0.3%
25                      25       0.1%

Maximum 10 values

Value                   Count    Frequency (%)
265                     14       0.1%
264                     345      1.5%
263                     392      1.7%
262                     259      1.1%
261                     130      0.6%
260                     21       0.1%
258                     1        < 0.1%
256                     14       0.1%
255                     33       0.1%
249                     483      2.1%

DOLocationID
Real number (ℝ)

Distinct: 216
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 161.528
Minimum: 1
Maximum: 265
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 177.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 43
Q1: 112
Median: 162
Q3: 233
95-th percentile: 257.1
Maximum: 265
Range: 264
Interquartile range (IQR): 121

Descriptive statistics

Standard deviation: 70.139691
Coefficient of variation (CV): 0.43422622
Kurtosis: -0.94501782
Mean: 161.528
Median Absolute Deviation (MAD): 68
Skewness: -0.32848288
Sum: 3666524
Variance: 4919.5762
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value                   Count    Frequency (%)
161                     858      3.8%
236                     802      3.5%
230                     761      3.4%
237                     759      3.3%
170                     699      3.1%
162                     681      3.0%
234                     661      2.9%
186                     653      2.9%
48                      619      2.7%
142                     612      2.7%
Other values (206)      15594    68.7%

Minimum 10 values

Value                   Count    Frequency (%)
1                       34       0.1%
4                       101      0.4%
7                       89       0.4%
9                       2        < 0.1%
10                      6        < 0.1%
11                      2        < 0.1%
12                      24       0.1%
13                      230      1.0%
14                      19       0.1%
15                      3        < 0.1%

Maximum 10 values

Value                   Count    Frequency (%)
265                     60       0.3%
264                     304      1.3%
263                     369      1.6%
262                     261      1.1%
261                     117      0.5%
260                     19       0.1%
259                     3        < 0.1%
258                     2        < 0.1%
257                     17       0.1%
256                     55       0.2%

payment_type
Categorical

Imbalance

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 22699
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 2

Common Values

Value                   Count    Frequency (%)
1                       15265    67.2%
2                       7267     32.0%
3                       121      0.5%
4                       46       0.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value                   Count    Frequency (%)
1                       15265    67.2%
2                       7267     32.0%
3                       121      0.5%
4                       46       0.2%

Most occurring characters

Value                   Count    Frequency (%)
1                       15265    67.2%
2                       7267     32.0%
3                       121      0.5%
4                       46       0.2%

Most occurring categories

Value                   Count    Frequency (%)
(unknown)               22699    100.0%

Most frequent character per category

(unknown): same values as the "Most occurring characters" table.

Most occurring scripts

Value                   Count    Frequency (%)
(unknown)               22699    100.0%

Most frequent character per script

(unknown): same values as the "Most occurring characters" table.

Most occurring blocks

Value                   Count    Frequency (%)
(unknown)               22699    100.0%

Most frequent character per block

(unknown): same values as the "Most occurring characters" table.

fare_amount
Real number (ℝ)

High correlation, Skewed

Distinct: 185
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.026629
Minimum: -120
Maximum: 999.99
Zeros: 6
Zeros (%): < 0.1%
Negative: 14
Negative (%): 0.1%
Memory size: 177.5 KiB

Quantile statistics

Minimum: -120
5-th percentile: 4.5
Q1: 6.5
Median: 9.5
Q3: 14.5
95-th percentile: 36
Maximum: 999.99
Range: 1119.99
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 13.243791
Coefficient of variation (CV): 1.0166706
Kurtosis: 1420.1897
Mean: 13.026629
Median Absolute Deviation (MAD): 3.5
Skewness: 21.663101
Sum: 295691.46
Variance: 175.39799
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value                   Count    Frequency (%)
6                       1163     5.1%
6.5                     1089     4.8%
5.5                     1081     4.8%
7                       1067     4.7%
7.5                     1018     4.5%
5                       994      4.4%
8.5                     984      4.3%
8                       947      4.2%
9                       885      3.9%
9.5                     869      3.8%
Other values (175)      12602    55.5%

Minimum 10 values

Value                   Count    Frequency (%)
-120                    1        < 0.1%
-4.5                    2        < 0.1%
-4                      2        < 0.1%
-3.5                    3        < 0.1%
-3                      2        < 0.1%
-2.5                    4        < 0.1%
0                       6        < 0.1%
0.01                    2        < 0.1%
1                       1        < 0.1%
2.5                     104      0.5%

Maximum 10 values

Value                   Count    Frequency (%)
999.99                  1        < 0.1%
450                     1        < 0.1%
200.01                  1        < 0.1%
200                     1        < 0.1%
175                     1        < 0.1%
152                     1        < 0.1%
150                     1        < 0.1%
140                     1        < 0.1%
131                     1        < 0.1%
120                     2        < 0.1%

extra
Real number (ℝ)

High correlation, Zeros

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.33327459
Minimum: -1
Maximum: 4.5
Zeros: 11921
Zeros (%): 52.5%
Negative: 9
Negative (%): < 0.1%
Memory size: 177.5 KiB

Quantile statistics

Minimum: -1
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0.5
95-th percentile: 1
Maximum: 4.5
Range: 5.5
Interquartile range (IQR): 0.5

Descriptive statistics

Standard deviation: 0.46309658
Coefficient of variation (CV): 1.3895346
Kurtosis: 27.000218
Mean: 0.33327459
Median Absolute Deviation (MAD): 0
Skewness: 3.5250157
Sum: 7565
Variance: 0.21445844
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=6)]

Common values

Value                   Count    Frequency (%)
0                       11921    52.5%
0.5                     7104     31.3%
1                       3564     15.7%
4.5                     101      0.4%
-0.5                    7        < 0.1%
-1                      2        < 0.1%

Minimum 10 values

Value                   Count    Frequency (%)
-1                      2        < 0.1%
-0.5                    7        < 0.1%
0                       11921    52.5%
0.5                     7104     31.3%
1                       3564     15.7%
4.5                     101      0.4%

Maximum 10 values

Value                   Count    Frequency (%)
4.5                     101      0.4%
1                       3564     15.7%
0.5                     7104     31.3%
0                       11921    52.5%
-0.5                    7        < 0.1%
-1                      2        < 0.1%

mta_tax
Categorical

High correlation, Imbalance

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Length

Max length: 4
Median length: 3
Mean length: 3.0005727
Min length: 3

Characters and Unicode

Total characters: 68110
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.5
2nd row: 0.5
3rd row: 0.5
4th row: 0.5
5th row: 0.5

Common Values

Value                   Count    Frequency (%)
0.5                     22596    99.5%
0.0                     90       0.4%
-0.5                    13       0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value                   Count    Frequency (%)
0.5                     22609    99.6%
0.0                     90       0.4%

Most occurring characters

Value                   Count    Frequency (%)
0                       22789    33.5%
.                       22699    33.3%
5                       22609    33.2%
-                       13       < 0.1%

Most occurring categories

Value                   Count    Frequency (%)
(unknown)               68110    100.0%

Most frequent character per category

(unknown): same values as the "Most occurring characters" table.

Most occurring scripts

Value                   Count    Frequency (%)
(unknown)               68110    100.0%

Most frequent character per script

(unknown): same values as the "Most occurring characters" table.

Most occurring blocks

Value                   Count    Frequency (%)
(unknown)               68110    100.0%

Most frequent character per block

(unknown): same values as the "Most occurring characters" table.

tip_amount
Real number (ℝ)

High correlation, Zeros

Distinct: 742
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.8357813
Minimum: 0
Maximum: 200
Zeros: 8057
Zeros (%): 35.5%
Negative: 0
Negative (%): 0.0%
Memory size: 177.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1.35
Q3: 2.45
95-th percentile: 6.35
Maximum: 200
Range: 200
Interquartile range (IQR): 2.45

Descriptive statistics

Standard deviation: 2.8006263
Coefficient of variation (CV): 1.5255773
Kurtosis: 1124.3261
Mean: 1.8357813
Median Absolute Deviation (MAD): 1.35
Skewness: 18.188305
Sum: 41670.4
Variance: 7.8435075
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value                   Count    Frequency (%)
0                       8057     35.5%
1                       1451     6.4%
2                       756      3.3%
1.5                     303      1.3%
3                       237      1.0%
1.66                    222      1.0%
1.45                    210      0.9%
1.36                    205      0.9%
1.55                    202      0.9%
1.26                    202      0.9%
Other values (732)      10854    47.8%

Minimum 10 values

Value                   Count    Frequency (%)
0                       8057     35.5%
0.01                    8        < 0.1%
0.02                    4        < 0.1%
0.03                    1        < 0.1%
0.04                    1        < 0.1%
0.07                    1        < 0.1%
0.08                    1        < 0.1%
0.1                     5        < 0.1%
0.12                    1        < 0.1%
0.15                    1        < 0.1%

Maximum 10 values

Value                   Count    Frequency (%)
200                     1        < 0.1%
55.5                    1        < 0.1%
51.64                   1        < 0.1%
46.69                   1        < 0.1%
42.29                   1        < 0.1%
28                      1        < 0.1%
25.2                    2        < 0.1%
25                      1        < 0.1%
22.22                   1        < 0.1%
21.3                    1        < 0.1%

tolls_amount
Real number (ℝ)

High correlation, Zeros

Distinct: 38
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.31254152
Minimum: 0
Maximum: 19.1
Zeros: 21525
Zeros (%): 94.8%
Negative: 0
Negative (%): 0.0%
Memory size: 177.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 5.54
Maximum: 19.1
Range: 19.1
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.3992119
Coefficient of variation (CV): 4.4768833
Kurtosis: 31.865134
Mean: 0.31254152
Median Absolute Deviation (MAD): 0
Skewness: 5.0827272
Sum: 7094.38
Variance: 1.957794
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=38)]

Common values

Value                   Count    Frequency (%)
0                       21525    94.8%
5.76                    847      3.7%
5.54                    239      1.1%
10.5                    21       0.1%
12.5                    11       < 0.1%
2.64                    10       < 0.1%
2.54                    6        < 0.1%
11.52                   3        < 0.1%
16.26                   3        < 0.1%
16.5                    2        < 0.1%
Other values (28)       32       0.1%

Minimum 10 values

Value                   Count    Frequency (%)
0                       21525    94.8%
2.16                    1        < 0.1%
2.54                    6        < 0.1%
2.64                    10       < 0.1%
2.7                     1        < 0.1%
4.32                    1        < 0.1%
5.16                    1        < 0.1%
5.44                    1        < 0.1%
5.45                    1        < 0.1%
5.49                    1        < 0.1%

Maximum 10 values

Value                   Count    Frequency (%)
19.1                    1        < 0.1%
18.28                   1        < 0.1%
18.26                   1        < 0.1%
18                      2        < 0.1%
17.5                    1        < 0.1%
17.28                   1        < 0.1%
16.62                   1        < 0.1%
16.5                    2        < 0.1%
16.26                   3        < 0.1%
16.2                    1        < 0.1%

improvement_surcharge
Categorical

High correlation, Imbalance

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Length

Max length: 4
Median length: 3
Mean length: 3.0006168
Min length: 3

Characters and Unicode

Total characters: 68111
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.3
2nd row: 0.3
3rd row: 0.3
4th row: 0.3
5th row: 0.3

Common Values

Value                   Count    Frequency (%)
0.3                     22679    99.9%
-0.3                    14       0.1%
0.0                     6        < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value                   Count    Frequency (%)
0.3                     22693    > 99.9%
0.0                     6        < 0.1%

Most occurring characters

Value                   Count    Frequency (%)
0                       22705    33.3%
.                       22699    33.3%
3                       22693    33.3%
-                       14       < 0.1%

Most occurring categories

Value                   Count    Frequency (%)
(unknown)               68111    100.0%

Most frequent character per category

(unknown): same values as the "Most occurring characters" table.

Most occurring scripts

Value                   Count    Frequency (%)
(unknown)               68111    100.0%

Most frequent character per script

(unknown): same values as the "Most occurring characters" table.

Most occurring blocks

Value                   Count    Frequency (%)
(unknown)               68111    100.0%

Most frequent character per block

(unknown): same values as the "Most occurring characters" table.

total_amount
Real number (ℝ)

High correlation, Skewed

Distinct: 1369
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.310502
Minimum: -120.3
Maximum: 1200.29
Zeros: 4
Zeros (%): < 0.1%
Negative: 14
Negative (%): 0.1%
Memory size: 177.5 KiB

Quantile statistics

Minimum: -120.3
5-th percentile: 5.8
Q1: 8.75
Median: 11.8
Q3: 17.8
95-th percentile: 46.06
Maximum: 1200.29
Range: 1320.59
Interquartile range (IQR): 9.05

Descriptive statistics

Standard deviation: 16.097295
Coefficient of variation (CV): 0.98692824
Kurtosis: 1321.9239
Mean: 16.310502
Median Absolute Deviation (MAD): 4
Skewness: 20.389403
Sum: 370232.09
Variance: 259.12292
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value                   Count    Frequency (%)
7.3                     541      2.4%
7.8                     531      2.3%
6.8                     524      2.3%
8.3                     519      2.3%
8.8                     492      2.2%
10.3                    464      2.0%
9.3                     458      2.0%
6.3                     447      2.0%
5.8                     420      1.9%
9.8                     406      1.8%
Other values (1359)     17897    78.8%

Minimum 10 values

Value                   Count    Frequency (%)
-120.3                  1        < 0.1%
-5.8                    2        < 0.1%
-5.3                    2        < 0.1%
-4.8                    2        < 0.1%
-4.3                    3        < 0.1%
-3.8                    3        < 0.1%
-3.3                    1        < 0.1%
0                       4        < 0.1%
0.3                     1        < 0.1%
0.31                    1        < 0.1%

Maximum 10 values

Value                   Count    Frequency (%)
1200.29                 1        < 0.1%
450.3                   1        < 0.1%
258.21                  1        < 0.1%
233.74                  1        < 0.1%
211.8                   1        < 0.1%
179.06                  1        < 0.1%
157.06                  1        < 0.1%
152.3                   1        < 0.1%
151.82                  1        < 0.1%
150.3                   1        < 0.1%

Interactions

[Figures: 100 pairwise interaction plots; images not included]

Correlations

[Figure: correlation heatmap]

Columns: (1) DOLocationID, (2) PULocationID, (3) RatecodeID, (4) VendorID, (5) extra, (6) fare_amount, (7) improvement_surcharge, (8) mta_tax, (9) passenger_count, (10) payment_type, (11) store_and_fwd_flag, (12) tip_amount, (13) tolls_amount, (14) total_amount, (15) trip_distance

                           (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)    (10)    (11)    (12)    (13)    (14)    (15)
DOLocationID             1.000   0.100  -0.026   0.038  -0.023  -0.091   0.022   0.099   0.001   0.024   0.017  -0.023  -0.035  -0.088  -0.101
PULocationID             0.100   1.000  -0.050   0.022  -0.015  -0.087   0.022   0.016  -0.009   0.027   0.000  -0.024  -0.075  -0.085  -0.089
RatecodeID              -0.026  -0.050   1.000   0.000  -0.070   0.272   0.000   0.000   0.020   0.000   0.000   0.114   0.510   0.272   0.209
VendorID                 0.038   0.022   0.000   1.000   0.013   0.000   0.025   0.020   0.278   0.082   0.073   0.000   0.005   0.006   0.012
extra                   -0.023  -0.015  -0.070   0.013   1.000  -0.004   0.567   0.589   0.004   0.179   0.011   0.039  -0.047   0.069   0.038
fare_amount             -0.091  -0.087   0.272   0.000  -0.004   1.000   0.188   0.202   0.028   0.084   0.000   0.394   0.362   0.978   0.912
improvement_surcharge    0.022   0.022   0.000   0.025   0.567   0.188   1.000   0.698   0.000   0.228   0.000   0.000   0.000   0.015   0.000
mta_tax                  0.099   0.016   0.000   0.020   0.589   0.202   0.698   1.000   0.014   0.218   0.000   0.108   0.461   0.171   0.124
passenger_count          0.001  -0.009   0.020   0.278   0.004   0.028   0.000   0.014   1.000   0.023   0.020  -0.020   0.014   0.024   0.039
payment_type             0.024   0.027   0.000   0.082   0.179   0.084   0.228   0.218   0.023   1.000   0.012   0.000   0.042   0.103   0.027
store_and_fwd_flag       0.017   0.000   0.000   0.073   0.011   0.000   0.000   0.000   0.020   0.012   1.000   0.000   0.000   0.000   0.016
tip_amount              -0.023  -0.024   0.114   0.000   0.039   0.394   0.000   0.108  -0.020   0.000   0.000   1.000   0.211   0.532   0.375
tolls_amount            -0.035  -0.075   0.510   0.005  -0.047   0.362   0.000   0.461   0.014   0.042   0.000   0.211   1.000   0.372   0.351
total_amount            -0.088  -0.085   0.272   0.006   0.069   0.978   0.015   0.171   0.024   0.103   0.000   0.532   0.372   1.000   0.897
trip_distance           -0.101  -0.089   0.209   0.012   0.038   0.912   0.000   0.124   0.039   0.027   0.016   0.375   0.351   0.897   1.000
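
The "High correlation" alerts in the Overview can be reproduced from this matrix by flagging every off-diagonal entry whose absolute value reaches a cutoff. The sketch below assumes a cutoff of 0.5, which is consistent with the alerts listed in this report but is not stated in it; `THRESHOLD` and `high` are illustrative names.

```python
names = [
    "DOLocationID", "PULocationID", "RatecodeID", "VendorID", "extra",
    "fare_amount", "improvement_surcharge", "mta_tax", "passenger_count",
    "payment_type", "store_and_fwd_flag", "tip_amount", "tolls_amount",
    "total_amount", "trip_distance",
]

# Correlation matrix copied row by row from this section.
matrix = [
    [1.000, 0.100, -0.026, 0.038, -0.023, -0.091, 0.022, 0.099, 0.001, 0.024, 0.017, -0.023, -0.035, -0.088, -0.101],
    [0.100, 1.000, -0.050, 0.022, -0.015, -0.087, 0.022, 0.016, -0.009, 0.027, 0.000, -0.024, -0.075, -0.085, -0.089],
    [-0.026, -0.050, 1.000, 0.000, -0.070, 0.272, 0.000, 0.000, 0.020, 0.000, 0.000, 0.114, 0.510, 0.272, 0.209],
    [0.038, 0.022, 0.000, 1.000, 0.013, 0.000, 0.025, 0.020, 0.278, 0.082, 0.073, 0.000, 0.005, 0.006, 0.012],
    [-0.023, -0.015, -0.070, 0.013, 1.000, -0.004, 0.567, 0.589, 0.004, 0.179, 0.011, 0.039, -0.047, 0.069, 0.038],
    [-0.091, -0.087, 0.272, 0.000, -0.004, 1.000, 0.188, 0.202, 0.028, 0.084, 0.000, 0.394, 0.362, 0.978, 0.912],
    [0.022, 0.022, 0.000, 0.025, 0.567, 0.188, 1.000, 0.698, 0.000, 0.228, 0.000, 0.000, 0.000, 0.015, 0.000],
    [0.099, 0.016, 0.000, 0.020, 0.589, 0.202, 0.698, 1.000, 0.014, 0.218, 0.000, 0.108, 0.461, 0.171, 0.124],
    [0.001, -0.009, 0.020, 0.278, 0.004, 0.028, 0.000, 0.014, 1.000, 0.023, 0.020, -0.020, 0.014, 0.024, 0.039],
    [0.024, 0.027, 0.000, 0.082, 0.179, 0.084, 0.228, 0.218, 0.023, 1.000, 0.012, 0.000, 0.042, 0.103, 0.027],
    [0.017, 0.000, 0.000, 0.073, 0.011, 0.000, 0.000, 0.000, 0.020, 0.012, 1.000, 0.000, 0.000, 0.000, 0.016],
    [-0.023, -0.024, 0.114, 0.000, 0.039, 0.394, 0.000, 0.108, -0.020, 0.000, 0.000, 1.000, 0.211, 0.532, 0.375],
    [-0.035, -0.075, 0.510, 0.005, -0.047, 0.362, 0.000, 0.461, 0.014, 0.042, 0.000, 0.211, 1.000, 0.372, 0.351],
    [-0.088, -0.085, 0.272, 0.006, 0.069, 0.978, 0.015, 0.171, 0.024, 0.103, 0.000, 0.532, 0.372, 1.000, 0.897],
    [-0.101, -0.089, 0.209, 0.012, 0.038, 0.912, 0.000, 0.124, 0.039, 0.027, 0.016, 0.375, 0.351, 0.897, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, inferred from the alerts rather than stated

# For each variable, list the other variables at or above the cutoff.
high = {
    a: [b for b, r in zip(names, row) if b != a and abs(r) >= THRESHOLD]
    for a, row in zip(names, matrix)
}

for name, partners in high.items():
    if partners:
        print(f"{name} -> {', '.join(partners)}")
```

Under this assumption, nine variables have at least one flagged partner, matching the nine High correlation alerts. The pair mta_tax and tolls_amount (0.461) falls just below the cutoff.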

Missing values

[Figure: nullity count by column] A simple visualization of nullity by column.
[Figure: nullity matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

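First rows

(index) | VendorID | tpep_pickup_datetime | tpep_dropoff_datetime | passenger_count | trip_distance | RatecodeID | store_and_fwd_flag | PULocationID | DOLocationID | payment_type | fare_amount | extra | mta_tax | tip_amount | tolls_amount | improvement_surcharge | total_amount
0 | 2 | 03/25/2017 8:55:43 AM | 03/25/2017 9:09:47 AM | 6 | 3.34 | 1 | N | 100 | 231 | 1 | 13.0 | 0.0 | 0.5 | 2.76 | 0.0 | 0.3 | 16.56
1 | 1 | 04/11/2017 2:53:28 PM | 04/11/2017 3:19:58 PM | 1 | 1.80 | 1 | N | 186 | 43 | 1 | 16.0 | 0.0 | 0.5 | 4.00 | 0.0 | 0.3 | 20.80
2 | 1 | 12/15/2017 7:26:56 AM | 12/15/2017 7:34:08 AM | 1 | 1.00 | 1 | N | 262 | 236 | 1 | 6.5 | 0.0 | 0.5 | 1.45 | 0.0 | 0.3 | 8.75
3 | 2 | 05/07/2017 1:17:59 PM | 05/07/2017 1:48:14 PM | 1 | 3.70 | 1 | N | 188 | 97 | 1 | 20.5 | 0.0 | 0.5 | 6.39 | 0.0 | 0.3 | 27.69
4 | 2 | 04/15/2017 11:32:20 PM | 04/15/2017 11:49:03 PM | 1 | 4.37 | 1 | N | 4 | 112 | 2 | 16.5 | 0.5 | 0.5 | 0.00 | 0.0 | 0.3 | 17.80
5 | 2 | 03/25/2017 8:34:11 PM | 03/25/2017 8:42:11 PM | 6 | 2.30 | 1 | N | 161 | 236 | 1 | 9.0 | 0.5 | 0.5 | 2.06 | 0.0 | 0.3 | 12.36
6 | 2 | 05/03/2017 7:04:09 PM | 05/03/2017 8:03:47 PM | 1 | 12.83 | 1 | N | 79 | 241 | 1 | 47.5 | 1.0 | 0.5 | 9.86 | 0.0 | 0.3 | 59.16
7 | 2 | 08/15/2017 5:41:06 PM | 08/15/2017 6:03:05 PM | 1 | 2.98 | 1 | N | 237 | 114 | 1 | 16.0 | 1.0 | 0.5 | 1.78 | 0.0 | 0.3 | 19.58
8 | 2 | 02/04/2017 4:17:07 PM | 02/04/2017 4:29:14 PM | 1 | 1.20 | 1 | N | 234 | 249 | 2 | 9.0 | 0.0 | 0.5 | 0.00 | 0.0 | 0.3 | 9.80
9 | 1 | 11/10/2017 3:20:29 PM | 11/10/2017 3:40:55 PM | 1 | 1.60 | 1 | N | 239 | 237 | 1 | 13.0 | 0.0 | 0.5 | 2.75 | 0.0 | 0.3 | 16.55

Last rows

(index) | VendorID | tpep_pickup_datetime | tpep_dropoff_datetime | passenger_count | trip_distance | RatecodeID | store_and_fwd_flag | PULocationID | DOLocationID | payment_type | fare_amount | extra | mta_tax | tip_amount | tolls_amount | improvement_surcharge | total_amount
22689 | 2 | 03/07/2017 12:25:52 PM | 03/07/2017 12:39:40 PM | 1 | 1.96 | 1 | N | 113 | 13 | 1 | 11.0 | 0.0 | 0.5 | 2.36 | 0.00 | 0.3 | 14.16
22690 | 2 | 09/21/2017 1:44:42 PM | 09/21/2017 1:52:06 PM | 1 | 0.89 | 1 | N | 43 | 142 | 1 | 7.0 | 0.0 | 0.5 | 1.95 | 0.00 | 0.3 | 9.75
22691 | 2 | 01/06/2017 1:50:14 AM | 01/06/2017 1:56:47 AM | 1 | 2.12 | 1 | N | 170 | 79 | 1 | 8.0 | 0.5 | 0.5 | 0.00 | 0.00 | 0.3 | 9.30
22692 | 1 | 07/16/2017 3:22:51 AM | 07/16/2017 3:40:52 AM | 1 | 5.70 | 1 | N | 249 | 17 | 1 | 19.0 | 0.5 | 0.5 | 4.05 | 0.00 | 0.3 | 24.35
22693 | 2 | 08/10/2017 10:20:04 PM | 08/10/2017 10:29:31 PM | 1 | 0.89 | 1 | N | 229 | 170 | 1 | 7.5 | 0.5 | 0.5 | 1.76 | 0.00 | 0.3 | 10.56
22694 | 2 | 02/24/2017 5:37:23 PM | 02/24/2017 5:40:39 PM | 3 | 0.61 | 1 | N | 48 | 186 | 2 | 4.0 | 1.0 | 0.5 | 0.00 | 0.00 | 0.3 | 5.80
22695 | 2 | 08/06/2017 4:43:59 PM | 08/06/2017 5:24:47 PM | 1 | 16.71 | 2 | N | 132 | 164 | 1 | 52.0 | 0.0 | 0.5 | 14.64 | 5.76 | 0.3 | 73.20
22696 | 2 | 09/04/2017 2:54:14 PM | 09/04/2017 2:58:22 PM | 1 | 0.42 | 1 | N | 107 | 234 | 2 | 4.5 | 0.0 | 0.5 | 0.00 | 0.00 | 0.3 | 5.30
22697 | 2 | 07/15/2017 12:56:30 PM | 07/15/2017 1:08:26 PM | 1 | 2.36 | 1 | N | 68 | 144 | 1 | 10.5 | 0.0 | 0.5 | 1.70 | 0.00 | 0.3 | 13.00
22698 | 1 | 03/02/2017 1:02:49 PM | 03/02/2017 1:16:09 PM | 1 | 2.10 | 1 | N | 239 | 236 | 1 | 11.0 | 0.0 | 0.5 | 2.35 | 0.00 | 0.3 | 14.15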
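
In the sample rows, total_amount equals the sum of fare_amount, extra, mta_tax, tip_amount, tolls_amount and improvement_surcharge, which helps explain why total_amount correlates so strongly with fare_amount. The sketch below checks that relation for four of the sample rows. It is a spot check only: the relation has not been verified for all 22699 observations, and `reconciles` is an illustrative helper.

```python
# (fare_amount, extra, mta_tax, tip_amount, tolls_amount,
#  improvement_surcharge, total_amount) for four of the sample rows,
# keyed by row index.
rows = {
    0: (13.0, 0.0, 0.5, 2.76, 0.0, 0.3, 16.56),
    4: (16.5, 0.5, 0.5, 0.00, 0.0, 0.3, 17.80),
    6: (47.5, 1.0, 0.5, 9.86, 0.0, 0.3, 59.16),
    22695: (52.0, 0.0, 0.5, 14.64, 5.76, 0.3, 73.20),
}

def reconciles(row, tol=0.005):
    """Return True if the components sum to the total within half a cent."""
    *components, total = row
    return abs(sum(components) - total) < tol

results = {idx: reconciles(row) for idx, row in rows.items()}
print(results)
```

Because total_amount appears to be derived from the other charge columns, including it alongside them as a model feature would duplicate information.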